AITopics | acceleration strategy

Collaborating Authors

acceleration strategy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Accelerating Augmentation Invariance Pretraining Jinhong Lin Cheng-En Wu

Neural Information Processing SystemsFeb-17-2026, 08:59:45 GMT

In ImageNet, our method achieves speedups of 4 in MoCo, 3.3 in SimCLR, and 2.5 in DINO, demonstrating substantial efficiency gains.

acceleration strategy, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

How Efficient Are Diffusion Language Models? A Critical Examination of Efficiency Evaluation Practices

Peng, Han, Liu, Peiyu, Dong, Zican, Cheng, Daixuan, Li, Junyi, Tang, Yiru, Wang, Shuo, Zhao, Wayne Xin

arXiv.org Artificial IntelligenceNov-11-2025

Diffusion language models (DLMs) have emerged as a promising alternative to the long-dominant autoregressive (AR) paradigm, offering a parallelable decoding process that could yield greater efficiency. Yet, in practice, current open-source DLMs often underperform their AR counterparts in speed, limiting their real-world utility. This work presents a systematic study of DLM efficiency, identifying key issues in prior evaluation methods. Through empirical benchmarking and a theoretical analysis, we demonstrate that AR models generally achieve higher throughput, while DLMs consistently lag. We also investigate acceleration strategies, finding that techniques like dual cache and parallel decoding mainly offer gains at small batch sizes, with their benefits diminishing upon scaling. Our findings underscore the necessity of robust evaluation methods and improved acceleration strategies to advance research on DLMs.

large language model, natural language, throughput, (17 more...)

arXiv.org Artificial Intelligence

2510.1848

Country:

Asia (0.46)
North America > United States > California (0.28)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)

Add feedback

Accelerating Augmentation Invariance Pretraining Jinhong Lin Cheng-En Wu

Neural Information Processing SystemsOct-10-2025, 12:55:04 GMT

In ImageNet, our method achieves speedups of 4 in MoCo, 3.3 in SimCLR, and 2.5 in DINO, demonstrating substantial efficiency gains.

acceleration strategy, representation, training budget, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Add feedback

dVLA: Diffusion Vision-Language-Action Model with Multimodal Chain-of-Thought

Wen, Junjie, Zhu, Minjie, Liu, Jiaming, Liu, Zhiyuan, Yang, Yicun, Zhang, Linfeng, Zhang, Shanghang, Zhu, Yichen, Xu, Yi

arXiv.org Artificial IntelligenceOct-1-2025

Vision-Language-Action (VLA) models are emerging as a next-generation paradigm for robotics. We introduce dVLA, a diffusion-based VLA that leverages a multimodal chain-of-thought to unify visual perception, language reasoning, and robotic control in a single system. For practical deployment, we mitigate inference latency by incorporating two acceleration strategies--a prefix attention mask and key-value (KV) caching--yielding up to 2 speedup at test-time inference. We evaluate dVLA in both simulation and the real world: on the LIBERO benchmark it achieves state-of-the-art performance with a 96.4% average success rate, consistently surpassing both discrete and continuous action policies; on a real Franka robot, it succeeds across a diverse task suite, including a challenging bin-picking task that requires multi-step planning, demonstrating robust real-world performance. Together, these results underscore the promise of unified diffusion frameworks for practical, high-performance VLA robotics. The development of VLA models has undergone two stages of evolution. In the first stage, a pre-trained vision-language backbone is used purely as a feature extractor, and the extracted features are mapped directly to robot actions. As vanilla VLA architectures proved inadequate for open-world instruction following and long-horizon tasks, a second-stage training paradigm co-trains on image-text data alongside action trajectories to preserve knowledge from the pre-trained VLM and, when necessary, to predict both sub-step reasoning and robot actions (Zhou et al., 2025b;a; Intelligence et al., 2025b; Driess et al., 2025).

artificial intelligence, arxiv preprint arxiv, dvla, (15 more...)

arXiv.org Artificial Intelligence

2509.25681

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments

Du, Yipeng, Wang, Zihao, Farhan, Ahmad, Angione, Claudio, Yang, Harry, Johnston, Fielding, Buban, James P., Colangelo, Patrick, Zhao, Yue, Yang, Yuzhe

arXiv.org Artificial IntelligenceAug-14-2025

The deployment of large-scale models, such as large language models (LLMs), incurs substantial costs due to their computational demands. To mitigate these costs and address challenges related to scalability and data security, there is a growing shift towards decentralized systems for model deployment, where choosing efficient inference acceleration schemes become crucial to manage computational resources effectively and enhance system responsiveness. In this work, we address the challenge of selecting optimal acceleration methods in decentralized systems by introducing a meta-learning-based framework. This framework automates the selection process by learning from historical performance data of various acceleration techniques across different tasks. Unlike traditional methods that rely on random selection or expert intuition, our approach systematically identifies the best acceleration strategies based on the specific characteristics of each task. We demonstrate that our meta-learning framework not only streamlines the decision-making process but also consistently outperforms conventional methods in terms of efficiency and performance. Our results highlight the potential of inference acceleration in decentralized AI systems, offering a path towards more democratic and economically feasible artificial intelligence solutions.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.09194

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

A Triple-Inertial Accelerated Alternating Optimization Method for Deep Learning Training

Yan, Chengcheng, Xu, Jiawei, Wang, Qingsong, Peng, Zheng

arXiv.org Artificial IntelligenceMar-13-2025

The stochastic gradient descent (SGD) algorithm has achieved remarkable success in training deep learning models. However, it has several limitations, including susceptibility to vanishing gradients, sensitivity to input data, and a lack of robust theoretical guarantees. In recent years, alternating minimization (AM) methods have emerged as a promising alternative for model training by employing gradient-free approaches to iteratively update model parameters. Despite their potential, these methods often exhibit slow convergence rates. To address this challenge, we propose a novel Triple-Inertial Accelerated Alternating Minimization (TIAM) framework for neural network training. The TIAM approach incorporates a triple-inertial acceleration strategy with a specialized approximation method, facilitating targeted acceleration of different terms in each sub-problem optimization. This integration improves the efficiency of convergence, achieving superior performance with fewer iterations. Additionally, we provide a convergence analysis of the TIAM algorithm, including its global convergence properties and convergence rate. Extensive experiments validate the effectiveness of the TIAM method, showing significant improvements in generalization capability and computational efficiency compared to existing approaches, particularly when applied to the rectified linear unit (ReLU) and its variants.

activation function, algorithm, dataset, (13 more...)

arXiv.org Artificial Intelligence

2503.08489

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Hunan Province (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Accelerating Augmentation Invariance Pretraining

Lin, Jinhong, Wu, Cheng-En, Wei, Yibing, Morgado, Pedro

arXiv.org Artificial IntelligenceOct-30-2024

Our work tackles the computational challenges of contrastive learning methods, particularly for the pretraining of Vision Transformers (ViTs). Despite the effectiveness of contrastive learning, the substantial computational resources required for training often hinder their practical application. To mitigate this issue, we propose an acceleration framework, leveraging ViT's unique ability to generalize across inputs of varying sequence lengths. Our method employs a mix of sequence compression strategies, including randomized token dropout and flexible patch scaling, to reduce the cost of gradient estimation and accelerate convergence. We further provide an in-depth analysis of the gradient estimation error of various acceleration strategies as well as their impact on downstream tasks, offering valuable insights into the trade-offs between acceleration and performance. We also propose a novel procedure to identify an optimal acceleration schedule to adjust the sequence compression ratios to the training progress, ensuring efficient training without sacrificing downstream performance. Our approach significantly reduces computational overhead across various self-supervised learning algorithms on large-scale datasets. In ImageNet, our method achieves speedups of 4$\times$ in MoCo, 3.3$\times$ in SimCLR, and 2.5$\times$ in DINO, demonstrating substantial efficiency gains.

acceleration strategy, representation, training budget, (12 more...)

arXiv.org Artificial Intelligence

2410.22364

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

The Impact of Inference Acceleration Strategies on Bias of LLMs

Kirsten, Elisabeth, Habernal, Ivan, Nanda, Vedant, Zafar, Muhammad Bilal

arXiv.org Artificial IntelligenceOct-29-2024

Last few years have seen unprecedented advances in capabilities of Large Language Models (LLMs). These advancements promise to deeply benefit a vast array of application domains. However, due to their immense size, performing inference with LLMs is both costly and slow. Consequently, a plethora of recent work has proposed strategies to enhance inference efficiency, e.g., quantization, pruning, and caching. These acceleration strategies reduce the inference cost and latency, often by several factors, while maintaining much of the predictive performance measured via common benchmarks. In this work, we explore another critical aspect of LLM performance: demographic bias in model generations due to inference acceleration optimizations. Using a wide range of metrics, we probe bias in model outputs from a number of angles. Analysis of outputs before and after inference acceleration shows significant change in bias. Worryingly, these bias effects are complex and unpredictable. A combination of an acceleration strategy and bias type may show little bias change in one model but may lead to a large effect in another. Our results highlight a need for in-depth and case-by-case evaluation of model bias after it has been modified to accelerate inference.

acceleration strategy, inference acceleration strategy, preprint, (15 more...)

arXiv.org Artificial Intelligence

2410.22118

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
North America > United States (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.48)
Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Hyperbolic Graph Attention Network

Zhang, Yiding, Wang, Xiao, Jiang, Xunqiang, Shi, Chuan, Ye, Yanfang

arXiv.org Machine LearningDec-6-2019

Graph neural network (GNN) has shown superior performance in dealing with graphs, which has attracted considerable research attention recently. However, most of the existing GNN models are primarily designed for graphs in Euclidean spaces. Recent research has proven that the graph data exhibits non-Euclidean latent anatomy. Unfortunately, there was rarely study of GNN in non-Euclidean settings so far. To bridge this gap, in this paper, we study the GNN with attention mechanism in hyperbolic spaces at the first attempt. The research of hyperbolic GNN has some unique challenges: since the hyperbolic spaces are not vector spaces, the vector operations (e.g., vector addition, subtraction, and scalar multiplication) cannot be carried. To tackle this problem, we employ the gyrovector spaces, which provide an elegant algebraic formalism for hyperbolic geometry, to transform the features in a graph; and then we propose the hyperbolic proximity based attention mechanism to aggregate the features. Moreover, as mathematical operations in hyperbolic spaces could be more complicated than those in Euclidean spaces, we further devise a novel acceleration strategy using logarithmic and exponential mappings to improve the efficiency of our proposed model. The comprehensive experimental results on four real-world datasets demonstrate the performance of our proposed hyperbolic graph attention network model, by comparisons with other state-of-the-art baseline methods.

graph, hyperbolic space, representation, (14 more...)

arXiv.org Machine Learning

1912.03046

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > West Virginia (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies

Zhao, Renbo, Haskell, William B., Tan, Vincent Y. F.

arXiv.org Machine LearningOct-24-2017

We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new framework for the convergence analysis, we prove improved convergence rates and computational complexities of the stochastic L-BFGS algorithms compared to previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also provide theoretical analyses for most of the strategies. Experiments on large-scale logistic and ridge regression problems demonstrate that our proposed strategies yield significant improvements vis-\`a-vis competing state-of-the-art algorithms.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Machine Learning

doi: 10.1109/TSP.2017.2784360

1704.00116

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback